Learning Information Status of Discourse Entities

Author

  • Malvina Nissim
Abstract

In this paper we address the issue of automatically assigning information status to discourse entities. Using an annotated corpus of conversational English and exploiting morpho-syntactic and lexical features, we train a decision tree to classify entities introduced by noun phrases as old, mediated, or new. We compare its performance with hand-crafted rules that are mainly based on morpho-syntactic features and closely relate to the guidelines that had been used for the manual annotation. The decision tree model achieves an overall accuracy of 79.5%, significantly outperforming the hand-crafted algorithm (64.4%). We also experiment with binary classifications by collapsing in turn two of the three target classes into one and retraining the model. The highest accuracy achieved on binary classification is 93.1%.
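The binary setup mentioned in the abstract (collapsing two of the three target classes into one, in turn) can be illustrated with a minimal sketch. The function names and the toy labels below are assumptions made for illustration only; they are not taken from the paper, and the retraining step itself is omitted.

```python
from collections import Counter

# The three information-status classes named in the abstract.
CLASSES = ("old", "mediated", "new")


def collapse(labels, merged):
    """Merge the two classes in `merged` into one label (hypothetical helper)."""
    a, b = merged
    merged_name = f"{a}+{b}"
    return [merged_name if lab in (a, b) else lab for lab in labels]


def binary_views(labels):
    """Return the three binary relabellings, one per merged pair of classes."""
    views = {}
    for i, a in enumerate(CLASSES):
        for b in CLASSES[i + 1:]:
            views[(a, b)] = collapse(labels, (a, b))
    return views


# Made-up gold labels, purely illustrative (not corpus data).
gold = ["old", "new", "mediated", "old", "new"]

for pair, view in binary_views(gold).items():
    # Each view has at most two distinct labels; a classifier would be
    # retrained on each view separately.
    print(pair, Counter(view))
```

Each of the three relabelled views would then serve as the target for a separately retrained classifier.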

Similar resources

Learning the Information Status of Noun Phrases in Spoken Dialogues

An entity in a dialogue may be old, new, or mediated/inferrable with respect to the hearer’s beliefs. Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. We address the under-investigated problem of automatically determining the information status of discourse entities. Specifically, we extend Nissim’s (2006) machine learning a...

Learning the Fine-Grained Information Status of Discourse Entities

While information status (IS) plays a crucial role in discourse processing, there have only been a handful of attempts to automatically determine the IS of discourse entities. We examine a related but more challenging task, fine-grained IS determination, which involves classifying a discourse entity as one of 16 IS subtypes. We investigate the use of rich knowledge sources for this task in comb...

The Effects of Discourse Cues on Garden Path Processing

We report a self-paced reading study that investigated garden-path sentences like "While the boy washed {a/the} dog barked loudly" and "While the man hunted {a/the} deer ran into the woods". In such sentences, the critical noun phrase (dog, deer) tends to be misparsed as an object of the preceding verb, and has to be re-analyzed as a subject of the following clause when the disambiguating verb (e.g....

The Computation of the Informational Status of Discourse Entities

During language production, processes of information structuring constitute a relevant part. These processes are regarded as a mapping from a conceptual structure to a perspective semantic structure. I will focus on one aspect of information structuring, namely the verbalization of the current mental representation of entities. For this verbalization, the infor...

A Framework For Annotating Information Structure In Discourse

We present a framework for the integrated analysis of the textual and prosodic characteristics of information structure in the Switchboard corpus of conversational English. Information structure describes the availability, organisation and salience of entities in a discourse model. We present standards for the annotation of information status (old, mediated and new), and give guidelines for ann...

Publication date: 2006